We describe the Library Event Matching classification algorithm implemented for use in the NOvA
$\nu_\mu \to \nu_e$ oscillation measurement. Library Event Matching, developed in a different form
by the earlier MINOS experiment, is a powerful approach in which input trial events are compared to
a large library of simulated events to find those that best match the input event. A key feature of
the algorithm is that the comparisons are based on all the information available in the event, as
opposed to higher-level derived quantities. The final event classifier is formed by examining the
details of the best-matched library events. We discuss the concept, definition, optimization, and
broader applications of the algorithm as implemented here. Library Event Matching is well-suited to
the monolithic, segmented detectors of NOvA and thus provides a powerful technique for event
discrimination.
1. Introduction

Classifying images into a small number of categories is a common task in scientific and
industrial fields. In particle physics, this task usually involves interpreting particle detector
data to determine the type of particles, interactions, or decays present. Given the sheer volume
of information that can be collected, the data is often first reduced to a set of derived
quantities by running algorithms that pull out key features: clusters, tracks, showers, jets, etc.
While this form of lossy compression is acceptable in some applications, it is worth exploring
whether a classification scheme that uses all of the available information is feasible, even in
cases where the data volume is high.

In this article we describe such a classification scheme developed to categorize neutrino
scattering events recorded in the NOvA detectors. In the Library Event Matching (LEM) algorithm, a
trial event of unknown type is compared to a large number of known “library” events to find those
events that are most similar to the trial event. The properties of those best-matched library
events reveal the likely nature of the trial event. A distinguishing feature of LEM is that the
comparisons are made using the energy depositions directly, to avoid any information loss from
calculating higher-level variables. This fundamental philosophy of LEM was developed within the
MINOS collaboration for its own neutrino event categorization needs. The LEM version described in
this article has substantial differences from its predecessor, many of which are motivated by the
higher spatial resolution of the NOvA detectors.

While we use NOvA as our case study, the approach discussed is generalizable and could be usefully
applied to any highly segmented detector, from hadron calorimeters determining jet multiplicity to
cubic kilometer arrays collecting neutrinos from astrophysical sources. As with many machine
learning algorithms, LEM requires a large number of known examples from each classification
category. In particle physics applications, these would typically come either from an advanced
Monte Carlo simulation or from calibration sources.

Figure 1: A sketch of the structure of the NOvA detectors. 4 cm × 6 cm cells run the length of
each 16 m × 16 m plane. The alternating vertical and horizontal orientations can be seen. They are
filled with liquid scintillator and each contains a looped wavelength-shifting fiber (not shown),
as described in the text. This cut-away sketch is diagrammatic only. The real cells have rounded
corners and the ends of the cells are capped for instrumentation and oil containment purposes. The
neutrino beam is incident from the left.

2. The NOvA experiment 

The NOvA (NuMI Off-axis $\nu_e$ Appearance) experiment studies the phenomenon of neutrino flavor
oscillation [3]. Neutrinos produced by the NuMI beamline at Fermilab are observed by a Near
Detector on the Fermilab site and by a Far Detector of identical construction located 810 km
downstream in Ash River, Minnesota. For the purposes of this article, the neutrino oscillation
mode of interest is $\nu_\mu \to \nu_e$, and the goal of the classification algorithm is to obtain
a sample of electron neutrino interactions in the Far Detector with the highest possible
efficiency and purity.

Figure 2: Example simulated events in the NOvA detectors. Only one of the two views is shown in
each case. Each box represents one cell and is positioned according to its plane number
(horizontal axis) and cell number (vertical axis). The color scale indicates the charge deposited
in photoelectrons, and is common to all three panels. (a) A $\nu_e$ CC event, with the
electron-induced electromagnetic shower clearly visible. (b) A neutral current event with a
$\pi^0$. The upper track is due to a proton. This event shows that the two showers from
$\pi^0 \to \gamma\gamma$ are not always distinct. (c) A $\nu_\mu$ CC event, with the usual
tell-tale long, straight muon track. Note that the axis ranges are approximately doubled for this
panel relative to the first two.

The NOvA detectors are constructed from long PVC cells filled with scintillator-doped mineral oil.
Each of the Far Detector’s 344,064 cells is 16 m long with rectangular cross section 4 cm × 6 cm.
A loop of wavelength-shifting fiber runs the length of each cell, with both ends of the fiber
terminating at one pixel of a 32-pixel APD array. The body of the 14-kiloton detector consists of
896 layers, or “planes”, each with 384 cells. Each plane is 16 m × 16 m square, and the depth of
the detector along the beam direction is 60 m. Alternate planes are aligned vertically and
horizontally so that three-dimensional information can be obtained through combination of the two
“views”. The detector has unprecedented granularity for its size, with one radiation length
(38 cm) extending over many cells, to give a detailed view of neutrino-induced electromagnetic
showers. Figure 1 shows a cut-away diagram of the detector’s construction.

The signal for the $\nu_\mu \to \nu_e$ oscillation analysis in NOvA is $\nu_e$ charged-current (CC)
scattering, which yields a high-energy electron in the final state that allows one to tag the
incident neutrino’s flavor. In the 1 to 3 GeV energy range of NOvA, this electron will be
accompanied, with similar probabilities, by a proton (quasi-elastic scattering), a nucleon plus a
pion (resonant scattering), or a richer hadronic shower (deep inelastic scattering). While nuclear
effects blur these crisp definitions, these three scattering types are useful for conveying the
variety of shapes that signal events in NOvA can take. The ~1 GeV electron in the final state
produces an electromagnetic shower in the detector that has a width of a few cells and runs
longitudinally an average distance of 2.5 m (40 planes). Figure 2(a) shows a simulated $\nu_e$ CC
interaction in the NOvA Far Detector.

The primary mis-identification background comes from neutral-current (NC) interactions,
particularly those where the recoil hadronic system contains a $\pi^0$. The $\pi^0$ decays quickly
to two photons, each of which induces an electromagnetic shower that is essentially
indistinguishable from an electron-induced shower. NC events, taken as a whole, look sufficiently
different from signal $\nu_e$ CC events that we can reject them well, but the differences are
sometimes obscured:

• The presence of two electromagnetic showers, rather than one, can reveal a $\pi^0$ in the final
state. However, if one of the showers has low energy or overlaps the other in the detector, it
can be missed.

• Photon-induced showers are separated from the neutrino interaction point due to the distance
traveled by the photon prior to its conversion. This gap is a tell-tale sign of a photon, but in
some cases the gap will be too small to resolve. The conversion length in NOvA is 50 cm.

• Photon-induced showers begin with two particles (an electron/positron pair) rather than one, but
these cases can end up indistinguishable given the energy resolution of the detector.

• The energy lost to the outgoing neutrino in NC scattering leads to reconstructed energies lower
than those of signal events. However, interactions from a sufficiently high-energy neutrino or
with a large energy transfer can fall in the signal region of 1 to 3 GeV reconstructed energy.

Figure 3: Signal and background distributions of visible energy expected in the Far Detector
sample. The effect of neutrino oscillations is included. Visible energy is defined as the incident
neutrino energy except in the case of neutral current events where the outgoing neutrino energy is
subtracted. The $\nu_e$ CC signal to be identified by LEM is shown in red. The neutral current,
$\nu_\mu$ charged current, and intrinsic beam $\nu_e$ charged current components are blue, black,
and magenta respectively.


Figure 2(b) shows a simulated NC event with a $\pi^0$.

Additional background comes from $\nu_\mu$ CC scattering, which produces a muon in the final
state. The muon leaves a long track of activity in the detector with a characteristic energy
deposition per unit pathlength. These are readily removed from the sample due to the clear muon
track except in cases where the muon is low in energy or is lost amongst other activity. In these
cases, the background is similar to NC interactions, with neutral pions playing the same role.
Figure 2(c) shows a $\nu_\mu$ CC example.

The NuMI beam also includes a 2% contamination of $\nu_e$. These $\nu_e$ interact identically to
the $\nu_e$ from oscillations and thus constitute a background to the $\nu_\mu \to \nu_e$
oscillation measurement. However, their rate is low and their energies are somewhat higher.
Figure 3 illustrates the energy differences among all the event classes before any selection cuts
have been applied.

Since the $\nu_e$ CC signal falls within a known energy range, we can safely remove lower and
higher energy events up front. For all figures and tables that follow, we require events to have
reconstructed visible energies between 0.5 GeV and 4 GeV.


3. Library Event Matching concept

At the heart of the LEM algorithm is the comparison of each unknown trial event to a large number
of known library events, with the comparisons based on low-level information collected by the
detector. For NOvA, this means using the calibrated energy depositions in all the detector cells
directly rather than forming higher-level objects such as showers and tracks from those.

Once the very best matches are found (here, the best 0.0001% of all library events), their known
properties are used to estimate the properties of the trial event. In the simplest version of LEM,
the fraction of the best matches that are signal events can be used as the discriminant.
Appendix A.1 discusses the relationship between LEM and other machine learning techniques.


3.1. The matching metric: motivation

When comparing two events, a metric is needed to quantify how similar they are. It is instructive
to look at the MINOS case briefly, as the situation there is somewhat simpler.

The MINOS detector has a segmented structure analogous to that of the NOvA detector, but the
effective spatial resolution for events of interest is significantly lower. A $\nu_e$ CC signal
event in MINOS involves only a couple dozen active “strips” (the analogue of NOvA’s cells), and
these active strips are clustered in a relatively compact pattern. Thus, two events with the same
underlying particle kinematics have a good chance of having identical (or near-identical)
arrangements of active strips. The readout electronics report the number of photoelectrons
detected in each active strip. Since this charge measurement suffers from shot noise (typical
charge: ~8 photoelectrons), strips with identical energy depositions may report different charges.
The level of difference is governed by Poisson statistics.

These details guided the form of matching metric used by MINOS, which can be thought of as the
likelihood $\mathcal{L}$ that the two events’ recorded charges represent the same underlying
energy depositions:

$$\log\mathcal{L} = \sum_i^{\text{strips}} \log\left[\int_0^\infty P(a_i|\lambda)\,P(b_i|\lambda)\,d\lambda\right], \tag{1}$$


where $a_i$ is the number of photoelectrons registered by the $i$-th strip of event A, $b_i$ is
the same for event B, $P(n|\lambda)$ is the Poisson probability of observing $n$ given mean
$\lambda$, and the sum runs over all strips active in at least one of the events. A higher
$\log\mathcal{L}$ for a pair of events means a better match. Before $\mathcal{L}$ is calculated,
the events, which in general occur in different parts of the detector, are spatially aligned by
shifting them so that their charge-weighted mean strip positions, rounded to the nearest strip,
overlap.
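
As an illustration of Eq. (1), note that the integral over the common Poisson mean has a closed
form, $\int_0^\infty P(a|\lambda)P(b|\lambda)\,d\lambda = (a+b)!/(a!\,b!\,2^{a+b+1})$, so each
strip term can be evaluated without numerical integration. The sketch below assumes that each
event is held as a dictionary mapping an already aligned strip position to a photoelectron count;
that representation and the function names are illustrative assumptions, not the MINOS code.

```python
from math import lgamma, log


def log_strip_term(a, b):
    """Log of the integral of P(a|lam) * P(b|lam) over lam in [0, inf).

    Uses the closed form (a+b)! / (a! * b! * 2**(a+b+1)), written with
    lgamma so that large charges do not overflow.
    """
    return lgamma(a + b + 1) - lgamma(a + 1) - lgamma(b + 1) - (a + b + 1) * log(2.0)


def log_likelihood(event_a, event_b):
    """Eq. (1): sum the strip terms over strips active in at least one event.

    event_a, event_b: dicts {strip_position: photoelectrons} (assumed layout).
    A strip that is inactive in one event contributes a charge of zero.
    """
    strips = set(event_a) | set(event_b)
    return sum(log_strip_term(event_a.get(s, 0), event_b.get(s, 0)) for s in strips)
```

With this form, two events with 8 photoelectrons on the same strip score higher than a pair with
8 and 2 photoelectrons on that strip, and both score higher than a pair whose charge sits on
different strips, which is the behavior the text describes.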

In the MINOS metric $\mathcal{L}$, displaced energy depositions in the two events do not get their
charges directly compared. To obtain good matches for a trial event, the library must be large
enough to span minor variations in active strip positions for nominally equivalent events. This is
possible in MINOS given the limited spatial resolution of the detectors for $\nu_e$ CC events.
That is, the library can be expected to give reasonable coverage of all possibilities. Requiring
exact charge agreement across the ~20 active strips, though, would be combinatorially
overwhelming. The Poisson factors take care of this, with acceptably different charges able to
contribute appropriately to the match score.

The NOvA detectors are significantly more finely-grained than those of MINOS. This makes event
discrimination easier in principle since more details are visible, but it makes the above matching
metric impractical. It is much less likely that “equivalent” activity in the trial and library
events will fall on the same cells. What is needed is a matching metric that rewards activity in
nearby cells without requiring them to lie directly on top of one another. A library event
identical to the trial event should still be a perfect match, but events with similar charges
offset by a cell or so should still score well.

The metric we use draws its motivation from electrostatics. 
Two Coulomb charge distributions of similar shape, but with 
opposite signs, will have a low electrostatic potential energy 
when overlaid and examined together, as the attraction between 
the opposite signed charges counters the internal repulsion of 
the like-signed charges. Two overlaid charge distributions with 
dissimilar shape suffer the internal repulsion but lack the benefit 
of mutual attraction, leading to a large potential energy. Given 
the electrostatic analogue to what follows, we use “energy” to 
refer to the LEM match score for the remainder of the article 
unless otherwise stated. Lower energies correspond to better 
matches. 

The match energy is defined as

$$E = E_A + E_B + E_{AB}, \tag{2}$$

where $E_A$ is the self-energy (repulsion) of event A’s charges, $E_B$ is the self-energy of
event B’s charges, and $E_{AB}$ is the (negative) energy due to the A/B attraction. The charges
are taken to be the recorded energy depositions in the NOvA cells. Treating the electrostatic
analogue as exact for a moment, the self-energy terms are given by


$$E_A = \frac{1}{2}\sum_i^{\text{cells}}\sum_j^{\text{cells}} \frac{a_i a_j}{r_{ij}}, \qquad
E_B = \frac{1}{2}\sum_i^{\text{cells}}\sum_j^{\text{cells}} \frac{b_i b_j}{r_{ij}}, \tag{3}$$

with $a_i$ ($b_i$) the recorded deposition in the $i$-th cell of event A (event B) and with
$r_{ij}$ the distance between cells $i$ and $j$. The $r_{ij} = 0$ case is handled again with an
electrostatic analogue by distributing all charges uniformly across their individual cells. (See
Appendix A.2.)

The interaction term is given by

$$E_{AB} = -\sum_i^{\text{cells}}\sum_j^{\text{cells}} \frac{a_i b_j}{r_{ij}}. \tag{4}$$

Before evaluating this sum, the events are globally aligned with one another according to a
separately reconstructed interaction vertex.¹

A perfect match, in which events A and B have identical depositions in identical cell positions,
would yield $E = 0$. A poorly matched pair with charges far away from one another will have large
energy $E \approx E_A + E_B$.

Eq. (4) can be recast in terms of one set of charges embedded in the field of the other:

$$E_{AB} = -\sum_i^{\text{cells}} a_i V_i, \tag{5}$$

where

$$V_i = \sum_j^{\text{cells}} \frac{b_j}{r_{ij}}. \tag{6}$$

The advantage of this formulation is that $V$ can be precalculated for each trial event, along
with the self-energies of the trial and library events. When matching against a large number of
library events using Eq. (5), the complexity is linear in the number of charges rather than
requiring a double sum over both trial and library charges.

¹ Alignment by charge-weighted mean cell position was also studied and gives similar
classification performance.

3.2. The matching metric in NOvA

While the NOvA matching metric is inspired by electrostatics, there is no reason to expect that
the precise form above will yield the best sensitivity. We incorporate the following
generalizations.

• Above, $r_{ij}$ is calculated as the Euclidean distance in terms of the number of planes
$\Delta p_{ij}$ and number of cells $\Delta c_{ij}$. However, NOvA events are boosted forward and
cover many planes longitudinally but relatively few cells transversely, so we assign different
relative importance to separations in the two directions.

• The $r^{-1}$ falloff with distance is generalized to $r^{-\alpha}$.

• The importance of larger charges relative to smaller ones is adjusted by raising all charges to
a power $\beta$.

The resulting form of the matching metric still follows Eq. (2), but the self-energy and
interaction terms are now given by

$$E_A = \frac{1}{2}\sum_i^{\text{cells}}\sum_j^{\text{cells}} a_i^\beta\, T_{ij}\, a_j^\beta, \qquad
E_B = \frac{1}{2}\sum_i^{\text{cells}}\sum_j^{\text{cells}} b_i^\beta\, T_{ij}\, b_j^\beta, \tag{7}$$

$$E_{AB} = -\sum_i^{\text{cells}} a_i^\beta\, U_i, \tag{8}$$

with the transfer matrix $T_{ij}$ and field $U_i$ given by

$$T_{ij} = \left(\frac{\Delta p_{ij}^2}{\sigma_p^2} + \frac{\Delta c_{ij}^2}{\sigma_c^2}\right)^{-\alpha/2}, \tag{9}$$

$$U_i = \sum_j^{\text{cells}} T_{ij}\, b_j^\beta. \tag{10}$$

The electrostatics version is recovered by setting

$$\sigma_p = \sigma_c = \alpha = \beta = 1. \tag{11}$$
Figure 4: Example of LEM matching. On the left is a trial $\nu_e$ CC event, on the right the best match found. The central panels show the potential $U$ in which the
library events are placed in order to calculate the match energy. The upper panels show one view, and the lower panels show the other.


We ran toy experiments with different values of these parameters and calculated a figure-of-merit
for each to optimize performance. The parameters chosen were:

$$\sigma_p = 0.286 \tag{12}$$

$$\sigma_c = 0.095 \tag{13}$$

$$\alpha = 0.25 \tag{14}$$

$$\beta = 0.5. \tag{15}$$

The first two parameters validate the intuition that transverse differences should be considered
more significant than longitudinal ones. The third parameter specifies a $1/\sqrt[4]{r}$ falloff
with distance, slower than the electrostatic analogue. For $\beta$, note that the simple presence
or absence of activity in a cell conveys information regardless of its charge. Having
$0 < \beta < 1$ moves the metric towards this binary “on/off” interpretation and away from a
charge-proportional weighting.
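
The generalized match energy can be sketched in a few lines. The block below is a minimal
illustration under stated assumptions, not the NOvA implementation: events are taken to be
dictionaries mapping an aligned (plane, cell) position to a deposited charge, the interaction term
is written as one event's charges in the field of the other, and the zero-separation term of the
transfer matrix is replaced by a hypothetical constant `T_SELF`, because the treatment the article
uses for that case (Appendix A.2) is not reproduced here.

```python
# Optimized parameter values quoted in the text for sigma_p, sigma_c, alpha, beta.
SIGMA_P, SIGMA_C, ALPHA, BETA = 0.286, 0.095, 0.25, 0.5

# Hypothetical stand-in for the zero-separation (i == j) transfer term.
# The article handles this case in Appendix A.2; the value here is an assumption.
T_SELF = 1.0


def transfer(pos_i, pos_j):
    """Transfer-matrix element for two (plane, cell) positions."""
    dp = pos_i[0] - pos_j[0]
    dc = pos_i[1] - pos_j[1]
    if dp == 0 and dc == 0:
        return T_SELF
    return ((dp / SIGMA_P) ** 2 + (dc / SIGMA_C) ** 2) ** (-ALPHA / 2)


def self_energy(event):
    """Half the double sum of q_i^beta * T_ij * q_j^beta over one event."""
    return 0.5 * sum(
        (qi ** BETA) * transfer(i, j) * (qj ** BETA)
        for i, qi in event.items()
        for j, qj in event.items()
    )


def potential(event, position):
    """Field of `event` evaluated at one cell position."""
    return sum(transfer(position, j) * (qj ** BETA) for j, qj in event.items())


def match_energy(event_a, event_b):
    """Total match energy E_A + E_B + E_AB; lower means a better match."""
    e_ab = -sum((qa ** BETA) * potential(event_b, i) for i, qa in event_a.items())
    return self_energy(event_a) + self_energy(event_b) + e_ab
```

Because the same transfer matrix enters all three terms, an event matched against itself gives
zero energy whatever value is assumed for `T_SELF`, and shifting a copy of the event away from the
original raises the energy. In a production setting the potential would be tabulated once per
event rather than recomputed for every cell, as the text notes.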


4. The library 

The library consists of 77M simulated neutrino events, of which 18M are signal $\nu_e$ CC events,
29M are background $\nu_\mu$ CC and NC events, and 30M are $\pi^0$-enriched NC background events.
Each trial event that LEM classifies is compared to these 77M events to find the 1,000 library
events that are most similar to it, as quantified by the metric above.² Figure 4 shows an example
trial event along with its event potential $U$ and its best-matched library event.

The library events are generated ahead of time using the full NOvA Monte Carlo simulation chain
including realistic neutrino flux, cross sections, and detector components. The flux is calculated
using a FLUKA/FLUGG implementation of the beamline elements, the neutrino interactions are
simulated by GENIE, and particle propagation through the detector geometry is handled by GEANT4.
Simulated energy depositions in the liquid scintillator are converted into expected signals by
NOvA electronics and data acquisition simulation code. The registered signals are corrected for
light attenuation in the cells’ fibers using standard NOvA calibration procedures.

NC events containing neutral pions are the dominant mis-identification background owing to the
electromagnetic showers from $\pi^0 \to \gamma\gamma$. Thus, we supplement the base background
library sample with a $\pi^0$-enriched library sample. To build this enriched sample, we apply a
cut that selects out only those neutral current events with a $\pi^0$ present in the final state
as reported by GENIE.

The library events are generated according to the expected flux (for background) or a 100%
$\nu_\mu \to \nu_e$ transmutation (for signal), without regard to any actual probabilities for
neutrino flavor change. Oscillations are introduced into the library later by event weighting.
This is discussed in Sec. 5 below; Appendix A.3 describes the oscillation probabilities used.

While increasing the library size beyond the 77M events would provide incremental improvement in
classification performance, we observe that these gains enter logarithmically with the number of
library events once the library is sufficiently large. In an earlier version of the algorithm, we
found that doubling the library size provided only 1% gain in physics sensitivity. In light of the
computational requirements discussed in Section 7, additional library events are not worthwhile
for our application.

² This statement is modified in Sec. 7.1 when we discuss speed optimizations.

4.1. Event flipping

To good approximation, flipping an event transversely in one or both views produces an equally
valid event. We use such flipping to effectively quadruple the size of the library when the
matching is performed. Each library event is used in each of the four possible configurations, and
the best of the four is retained. This symmetry is not quite perfect in the NOvA detectors.
Attenuation in the readout fibers leads to subtly different charge resolutions and threshold
effects on transversely opposing sides of an event, and NuMI neutrinos at the Far Detector enter
at a 3° upwards angle. Nevertheless, the best-scoring matches come from the four possible flipped
configurations with nearly equal probability: 26% from unflipped events, 50% from events with
either one of the two views flipped, and 24% from events with both views flipped.

5. Decision tree

As library size increases, the fraction of an event’s best matches that are truly signal tends
toward the probability that the trial event itself is signal. Further, all of the information
available in the trial event is used when determining this probability. It is in this sense that
LEM is optimal.

For a library of finite and practical size, though, this signal fraction alone does not contain
the full information extractable. Other statistics constructed from the details of the best
matches may, for example, indicate that the matches are drawn from an area of sparse library
coverage and are thus less reliable. The most powerful approach given a finite library is to
construct several statistics describing the matches and to feed these into one of the standard
multivariate analysis techniques to extract the final classifier. In LEM, five variables are
constructed from the 1,000 best library matches and are used as inputs to a decision tree, along
with the calorimetric energy of the trial event as a sixth input.

5.1. Weighted fraction of signal matches

The basic quantity measuring what fraction of the best matches are signal events can be improved
upon by weighting up the truly best matches over the lesser ones when calculating the signal
fraction. We use the weighting

$$w'_n = \exp\left[-\lambda\left(\frac{E_n}{E_{1000}}\right)^{\gamma}\right], \tag{16}$$

where $n$ is the match index, $E_n$ is the energy of the $n$-th best match for the trial event,
and $E_{1000}$ is the energy of the final (1000th) best match. The optimized values used for
$\lambda$ and $\gamma$ in NOvA are

$$\lambda = 6.67 \tag{17}$$

$$\gamma = 10. \tag{18}$$

The typical ratio of weights $w'_{1000}/w'_1 \approx 0.1\%$, indicating that the most important
matches are captured within the first thousand.

In practice, the weight must also include the oscillation probabilities alluded to earlier:

$$w_n = w'_n\, P^{\text{osc}}_n, \tag{19}$$

where $P^{\text{osc}}_n$ is the oscillation probability of match $n$, as described in
Appendix A.3.

All sums below that are indexed by $n$ run over the match list. For notational convenience we also
define $W = \sum_n w_n$. This weighting scheme is used for all five quantities formed from the
best-match list. The first is the weighted fraction of signal matches,

$$f_{\text{sig}} = \frac{1}{W}\sum_{n,\,\text{sig}} w_n, \tag{20}$$

where this sum includes only those terms due to signal matches.

5.2. Mean hadronic y

Signal events in which the outgoing electron carries only a small fraction of the incident
neutrino’s energy will look very much like NC background events. The kinematic quantity $y$ (or
rather, $1 - y$) measures this fraction: $1 - y = K_e/K_\nu$, where we’ve used $K_e$ and $K_\nu$
as the outgoing and incoming lepton energies to avoid confusion with the match energies $E$. If a
trial event matches well to signal events with high $y$, this can suggest that the trial event is
in fact a high-$y$ NC event. A second input is the mean $y$ for the best matches:

$$\langle y \rangle = \frac{1}{W}\sum_n w_n\, y_n. \tag{21}$$

5.3. Mean matched charge fraction

Matched charge fraction is an independent measure of the quality of the library matches, separate
from the match energy. For each trial/match pair, this is the quantity of charge that has a
counterpart on identical cells in the two events divided by the total charge in the two events:

$$f_Q = \frac{2\sum_i^{\text{cells}} \min(a_i, b_i)}{\sum_i^{\text{cells}} (a_i + b_i)}. \tag{22}$$

The weighted average of the matched charge fraction over all the matches yields the next input:

$$\langle f_Q \rangle = \frac{1}{W}\sum_n w_n\, f_{Q,n}. \tag{23}$$
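
The match-list quantities of Secs. 5.1 to 5.3 can be sketched as follows. This is an illustrative
outline under assumptions rather than the NOvA code: the match list is assumed to be sorted by
ascending energy with a strictly positive final energy, the per-match oscillation probabilities
are taken as given inputs, and events are again dictionaries mapping a cell position to a charge.
All function names are hypothetical.

```python
from math import exp

# Optimized values quoted in the text for the weighting parameters.
LAMBDA, GAMMA = 6.67, 10.0


def match_weights(energies, osc_probs):
    """Weights w_n = exp(-lambda * (E_n / E_last)^gamma) * P_osc_n.

    energies: match energies sorted from best (lowest) to worst;
    the last entry plays the role of the final retained match.
    """
    e_last = energies[-1]
    return [exp(-LAMBDA * (e / e_last) ** GAMMA) * p
            for e, p in zip(energies, osc_probs)]


def weighted_signal_fraction(weights, is_signal):
    """Weighted fraction of the match list that is signal."""
    return sum(w for w, sig in zip(weights, is_signal) if sig) / sum(weights)


def weighted_mean(weights, values):
    """Weighted average over the match list (used for <y> and the mean charge fraction)."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def matched_charge_fraction(event_a, event_b):
    """Twice the charge shared on identical cells, divided by the total charge."""
    cells = set(event_a) | set(event_b)
    matched = sum(min(event_a.get(c, 0.0), event_b.get(c, 0.0)) for c in cells)
    return 2.0 * matched / (sum(event_a.values()) + sum(event_b.values()))
```

With the quoted parameter values the last match in the list receives a weight of $e^{-6.67}$
before the oscillation factor is applied, which is about 0.1% and agrees with the typical weight
ratio stated in Sec. 5.1.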
Figure 5: The six decision tree inputs described in the text. The red curves show the distribution of signal events. The blue, black, and magenta curves show the
distributions of neutral current, $\nu_\mu$ CC, and intrinsic $\nu_e$ CC backgrounds respectively. The signal and neutral current background are normalized to equal area.
The other backgrounds are to the same scale as the neutral current curve. The signal distributions for $f_{\text{sig}}$ and $f_{\text{enr}}$ are very sharply peaked at 1, so we have plotted
these quantities as $\tanh^{-1}(f_{\text{sig}})$ and $\tanh^{-1}(f_{\text{enr}})$ to keep the signal and background curves visible on the same vertical scale.


5.4. Match energy difference 

This quantity measures whether the signal or background matches are the better matches on average.
It is the difference of the weighted mean energy of each class of matches:

$$D = \frac{\sum_{n,\,\text{sig}} w_n E_n}{\sum_{n,\,\text{sig}} w_n} - \frac{\sum_{n,\,\text{bkg}} w_n E_n}{\sum_{n,\,\text{bkg}} w_n}. \tag{24}$$


5.5. Enriched fraction 


The final match list quantity, similar in construction to $f_{\text{sig}}$, is the weighted
fraction of signal matches present among the signal and $\pi^0$-enriched matches (i.e., excluding
the non-enriched background),

$$f_{\text{enr}} = \frac{\sum_{n,\,\text{sig}} w_n}{\sum_{n,\,\text{enr}} w_n + \sum_{n,\,\text{sig}} w_n}. \tag{25}$$
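
The match energy difference and the enriched fraction can be sketched in the same style. The
labels `"sig"`, `"enr"`, and `"bkg"` and both function names are assumptions made for this
example, and the sketch leaves out the convention needed when one class has no matches in the
list, which a real implementation would have to define.

```python
def match_energy_difference(weights, energies, is_signal):
    """Weighted mean energy of signal matches minus that of background matches."""
    def mean_energy(select):
        w_sum = sum(w for w, s in zip(weights, is_signal) if s == select)
        we_sum = sum(w * e for w, e, s in zip(weights, energies, is_signal)
                     if s == select)
        return we_sum / w_sum  # assumes at least one match of this class

    return mean_energy(True) - mean_energy(False)


def enriched_fraction(weights, labels):
    """Weighted signal fraction among signal and pi0-enriched matches only.

    labels: "sig", "enr" (pi0-enriched NC) or "bkg" (non-enriched background);
    matches labelled "bkg" are ignored.
    """
    sig = sum(w for w, lab in zip(weights, labels) if lab == "sig")
    enr = sum(w for w, lab in zip(weights, labels) if lab == "enr")
    return sig / (sig + enr)
```

A negative energy difference means the signal matches sit at lower energy, and so are the better
matches on average, under the sign convention used here.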


5.6. Total calorimetric energy

NC backgrounds skew heavily to low visible energy thanks to the energy removed by the exiting
neutrino. The sum of all depositions $\{a_i\}$ recorded in the trial event,

$$E_{\text{cal}} = \sum_i^{\text{cells}} a_i, \tag{26}$$

is included as a final input so that the classifier knows the prior expectations of signal and
background.

5.7. Choice of a decision tree, and figure of merit

There are many multivariate techniques capable of combining these six input quantities into a
single classifier output. We investigated artificial neural networks, support vector machines, and
decision trees. An ensemble decision tree yielded the best performance of the approaches tried.
One problem with other techniques is that the figure of merit (f.o.m.) that, for example,
artificial neural network training aims to minimize is the mean-squared-error of the classifier
variable $c$:

$$\text{f.o.m.} = \sum_i^{\text{sig}} (1 - c_i)^2 + \sum_i^{\text{bkg}} c_i^2, \tag{27}$$

where the sums run over the signal and background training samples. However, the figure of merit
relevant to an experiment measuring the magnitude of a signal excess $s$ over a background $b$
with Poisson fluctuations is

$$\text{f.o.m.} = \frac{s}{\sqrt{s + b}}. \tag{28}$$

If events are binned according to, say, the classifier output, the generalization is simply to sum
in quadrature the significances in the individual bins:

$$\text{f.o.m.} = \sqrt{\sum_i^{\text{bins}} \frac{s_i^2}{s_i + b_i}}. \tag{29}$$

While training a decision tree classifier, if the sample is divided at each step into subsamples 1
and 2 so as to maximize

$$\frac{s_1^2}{s_1 + b_1} + \frac{s_2^2}{s_2 + b_2}, \tag{30}$$

then the performance of the full classifier is trivially optimized with respect to the figure of
merit in Eq. (29).

The final classifier output is a voting ensemble of 1,000 decision trees each trained on a
randomly chosen half of the full training sample. The ensemble technique protects against
overtraining, a feature that we confirmed by evaluating the classifier performance on independent
control samples.

Figure 6: The distribution of the LEM output variable for $\nu_e$ CC signal events (red) compared
to the background components: neutral current (blue), $\nu_\mu$ CC (black) and intrinsic beam
$\nu_e$ CC (magenta). In order to make the details in the signal-like region visible, the y-axis
truncates much of the background peak. 95% of neutral current events and 98% of $\nu_\mu$ charged
current events have LEM < 0.15. The distributions are scaled to a nominal 3-year NuMI exposure of
$1.8 \times 10^{21}$ protons-on-target.
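
The figures of merit of Sec. 5.7 and a single tree-splitting step can be illustrated as below.
This is a sketch of one greedy split on one input variable under assumed data structures (a list
of `(value, is_signal, weight)` tuples); it is not the training code used for the LEM ensemble,
and the function names are hypothetical. The connection to the binned figure of merit is that its
square is additive over bins, so choosing the split that maximizes the sum of $s^2/(s+b)$ over the
two subsamples increases the binned figure of merit as much as any single cut on that variable can.

```python
from math import sqrt


def fom(s, b):
    """Significance of a signal s over a background b: s / sqrt(s + b)."""
    return s / sqrt(s + b)


def binned_fom(bins):
    """Quadrature sum of per-bin significances over (s_i, b_i) pairs."""
    return sqrt(sum(s * s / (s + b) for s, b in bins if s + b > 0))


def split_score(s1, b1, s2, b2):
    """Split criterion s1^2/(s1+b1) + s2^2/(s2+b2); empty subsamples add nothing."""
    return sum(s * s / (s + b) for s, b in ((s1, b1), (s2, b2)) if s + b > 0)


def best_split(events):
    """Scan cut positions on one input and return (cut, score) with the highest score.

    events: list of (value, is_signal, weight) tuples (assumed layout).
    The cut is None when no split improves on the unsplit sample.
    """
    ordered = sorted(events)
    s_tot = sum(w for _, sig, w in ordered if sig)
    b_tot = sum(w for _, sig, w in ordered if not sig)
    best_cut, best_score = None, split_score(s_tot, b_tot, 0.0, 0.0)
    s1 = b1 = 0.0
    for k in range(len(ordered) - 1):
        value, sig, w = ordered[k]
        if sig:
            s1 += w
        else:
            b1 += w
        next_value = ordered[k + 1][0]
        if next_value == value:
            continue  # cannot cut between identical values
        score = split_score(s1, b1, s_tot - s1, b_tot - b1)
        if score > best_score:
            best_cut, best_score = 0.5 * (value + next_value), score
    return best_cut, best_score
```

For a toy sample with two background events at low values and two signal events at high values,
the scan places the cut between the two groups, where the criterion doubles relative to the
unsplit sample.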


6. Classification performance 


Figure 5 shows the distribution of the six input variables for all event classes in the NOvA
$\nu_\mu \to \nu_e$ analysis. Figure 6 shows the final LEM classifier output. Figure 7 shows the
signal efficiency and purity obtained with various cuts on the LEM output. All curves come from
Monte Carlo simulation of the expected NOvA data set. We choose the cut on the LEM output variable
that maximizes the figure-of-merit in Eq. (28). When applying LEM in a full experimental setting,
one can fit the output distribution to gain additional discrimination power.

Table 1 shows the expected number of signal and background events selected by the optimum LEM cut.
The signal efficiency is 55% for a background mis-identification rate of 2.0%. The muon track of
$\nu_\mu$ CC events keeps their mis-identification rate particularly low. Background beam $\nu_e$
events are selected with a lower efficiency than signal $\nu_e$ events. This is possible due to
the different underlying energy spectra of the two classes. As there is no absolute metric by
which to judge the performance of the LEM classification algorithm described here, we note simply
that the performance shown is excellent for the physics goals of NOvA.
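
As a quick arithmetic check, the efficiencies and the figure of merit at the optimum cut can be
recomputed from the rounded event counts in Table 1. The dictionary keys below are labels chosen
for this example; the numbers are the table entries.

```python
from math import sqrt

# Expected event counts from Table 1, before and after the optimum LEM cut.
before = {"signal": 105, "NC": 734, "numu CC": 573, "beam nue CC": 25}
after = {"signal": 58, "NC": 14, "numu CC": 4.6, "beam nue CC": 7.9}

# Per-category selection efficiency.
efficiency = {name: after[name] / before[name] for name in before}

# Total background before and after the cut.
bkg_before = sum(n for name, n in before.items() if name != "signal")
bkg_after = sum(n for name, n in after.items() if name != "signal")

# Figure of merit s / sqrt(s + b) for the selected sample.
figure_of_merit = after["signal"] / sqrt(after["signal"] + bkg_after)

for name, eff in efficiency.items():
    print(f"{name:12s} {100 * eff:5.1f}%")
print(f"total bkg    {100 * bkg_after / bkg_before:5.1f}%")
print(f"f.o.m.       {figure_of_merit:5.2f}")
```

The recomputed values agree with the quoted 55% signal efficiency and 2.0% total background rate;
small differences in individual categories are expected because the table entries are rounded.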



Figure 7: Efficiency and purity of the $\nu_e$ candidate sample selected by LEM for different cut
positions. The dashed lines are curves of constant f.o.m. = $s/\sqrt{s + b}$, and the solid circle
indicates the result of the optimum cut.

7. Computational optimization 

7.1. Speed 

While each individual energy calculation can be performed very quickly, classifying a single event
takes some time given the large size of the library. For the NOvA application, a single event must
be treated in a second or so, which is the time scale required by other steps already performed
during NOvA event processing. Without specialized hardware to run the inner loop, techniques to
manage the LEM matching time focus on reducing the number of energies that need to be calculated.

We achieve a significant speed-up by introducing a library 
“index”. If trial event A matches well to library event B, A will 
likely match well to other library events that are, themselves, 
good matches to B. Similarly, if A and B match poorly, then A 
will likely match poorly to library events similar to B. 

A library index is formed by drawing 10,000 events uniformly
from the full library and matching each of these to the
full library. For each index event, a list of its 1,000,000
best-matched library events is saved. This process happens ahead
of time, at library creation. When a trial event is classified, it is
compared first to the 10,000 index events to find the single
best-matching index event. The trial event is then compared only to
the 1M sibling events of that index event, reducing the total
number of energies calculated per trial event from 77,000,000
to 1,010,000, a significant speed improvement that takes the
per-trial matching time from 97 s down to 1.7 s on a 2.3 GHz
AMD Opteron processor. Empirically, we find that 85% of the
trial event’s “true” one-thousand top matches are captured with
this indexed approach, and we find no noticeable degradation in
the physics performance.
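
The two-stage search described above can be sketched at toy scale as follows. This is a minimal illustration, not the NOvA implementation: the squared Euclidean distance stands in for the LEM match energy, and the function names (`match_energy`, `build_index`, `indexed_matches`) and array sizes are assumptions made for this example.

```python
import numpy as np

def match_energy(trial, events):
    """Stand-in for the LEM match energy: squared Euclidean distance (an assumption)."""
    diff = events - trial
    return np.einsum("ij,ij->i", diff, diff)

def build_index(library, n_index, n_siblings, rng):
    """Done once, at library creation: pick index events and store their best matches."""
    index_ids = rng.choice(len(library), size=n_index, replace=False)
    siblings = np.empty((n_index, n_siblings), dtype=np.int64)
    for row, idx in enumerate(index_ids):
        energies = match_energy(library[idx], library)
        # Unordered set of the n_siblings lowest-energy library events.
        siblings[row] = np.argpartition(energies, n_siblings - 1)[:n_siblings]
    return index_ids, siblings

def indexed_matches(trial, library, index_ids, siblings, k):
    """Classification time: find the best index event, then search only its siblings."""
    best_row = np.argmin(match_energy(trial, library[index_ids]))
    candidates = siblings[best_row]
    energies = match_energy(trial, library[candidates])
    return candidates[np.argsort(energies)[:k]]

# Toy sizes; the text uses 10,000 index events and 1,000,000 siblings of 77,000,000.
rng = np.random.default_rng(0)
library = rng.normal(size=(2000, 16))
index_ids, siblings = build_index(library, n_index=20, n_siblings=200, rng=rng)
trial = rng.normal(size=16)
top = indexed_matches(trial, library, index_ids, siblings, k=10)
energies_computed = len(index_ids) + siblings.shape[1]  # 220 instead of 2000
```

The cost per trial event is the number of index events plus the number of siblings, which mirrors the 10,000 + 1,000,000 = 1,010,000 energy calculations quoted in the text. The fraction of true best matches recovered depends on the data and the metric, so the toy example makes no claim about the 85% figure.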

7.2. Memory 

The speed optimization above is what allows the use of a
77M event library. However, such a large library strains memory
resources. The full library is too large (~53 GB each for
the library and index) to read from disk for each event, yet it
is larger than the typical per-core memory allocation on grid
computing nodes.

                νe signal   Tot. bkgd.   NC      νμ CC   Beam νe CC
No selection    105         1332         734     573     25
LEM             58          27           14      4.6     7.9
Efficiency      55%         2.0%         2.0%    0.8%    32%

Table 1: Number of events expected in each event category initially and again after an optimal LEM cut, assuming a nominal 3-year NuMI exposure of 1.8 × 10^21
protons-on-target. The background is shown both as a total and broken down into NC, νμ CC, and intrinsic beam νe CC components. The bottom row shows
the efficiencies for selecting events in each category. The “no selection” row and the efficiencies derived from it count only those events with reconstructed visible
energy between 0.5 GeV and 4 GeV.


Thus, the library is converted from its original high-level format
into the memory representation used by a running job. This
representation includes the self-energy of each event. The conversion
inflates the library slightly to 131 GB, but the advantage
is that it can now be shared between running processes. Each
parallel matching job uses the mmap() system call to make the
contents of this file visible in its address space. The mapping is
marked read-only, so the kernel shares the pages between all the
running processes. For example, on a 64-core server, the memory
requirement to run 64 matching jobs is still only 131 GB,
equivalent to an unshared 2 GB per core. In case of memory
pressure, the kernel will discard pages, knowing that they can
be retrieved from disk (that is, the library file essentially acts
as swap space), although this will significantly impact performance.
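
The read-only mapping described above can be illustrated with Python's standard `mmap` module. This is a minimal sketch under stated assumptions: the flat float32 file layout, the file name, and the helper names `write_library` and `map_library` are hypothetical and do not describe the actual NOvA library format.

```python
import mmap
import os
import tempfile

import numpy as np

def write_library(path, events):
    """Write events as a flat float32 binary file (a hypothetical layout)."""
    np.asarray(events, dtype=np.float32).tofile(path)

def map_library(path, n_cols):
    """Map the whole file read-only and view it as an (n_events, n_cols) array.

    Every process that maps the same file this way is served from the same
    physical pages by the kernel, so the data is held in memory only once.
    """
    with open(path, "rb") as f:
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return np.frombuffer(buf, dtype=np.float32).reshape(-1, n_cols)

# Usage: write a tiny stand-in library, then map it as a matching job would.
path = os.path.join(tempfile.mkdtemp(), "library.bin")
events = np.arange(12, dtype=np.float32).reshape(3, 4)
write_library(path, events)
library = map_library(path, n_cols=4)
```

Because the mapping is read-only, the resulting array cannot be modified in place, which is what lets the pages be shared safely between jobs.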


8. Other information available in the match list 

In addition to signal-or-background classification, the detailed
truth information available in the list of best matches allows
other information about the trial event to be inferred. One
could extract probabilities for different interaction modes, the
inelasticity, and so on, without requiring any independent
reconstruction. An application that has been pursued is the
estimation of the incident neutrino energy for νe CC events. Simply
by averaging the true neutrino energies of the best signal library
matches and calibrating the resulting estimator, we achieve an
energy resolution of 8.8% on signal events selected by the
oscillation analysis, competitive with other energy estimators in
NOvA.
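
A minimal sketch of this kind of estimator is given below. The text does not specify how the average is calibrated, so the straight-line least-squares calibration here is an assumption, as are the function names and the toy numbers.

```python
import numpy as np

def raw_energy_estimate(true_energies, is_signal):
    """Mean true neutrino energy of the signal events among the best matches."""
    energies = np.asarray(true_energies, dtype=float)
    signal = energies[np.asarray(is_signal, dtype=bool)]
    return signal.mean() if signal.size else np.nan

def fit_calibration(raw, truth):
    """Least-squares straight line mapping raw estimates to true energies.

    The linear form is an assumption made for illustration.
    """
    slope, intercept = np.polyfit(raw, truth, deg=1)
    return slope, intercept

# Toy match list: three signal matches and one background match (energies in GeV).
estimate = raw_energy_estimate([1.8, 2.2, 2.0, 5.0], [True, True, True, False])

# Toy calibration sample of (raw estimate, true energy) pairs from simulation.
slope, intercept = fit_calibration([1.0, 2.0, 3.0], [1.1, 2.1, 3.1])
calibrated = slope * estimate + intercept
```

Only the signal matches enter the average, so the background match at 5.0 GeV does not affect the estimate.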


9. Summary 

The Library Event Matching algorithm compares input trial
events to a large library of known events using all the information
available, making LEM an optimal classifier given a
sufficiently large library. The NOvA implementation of LEM
has demonstrated excellent performance in separating νe signal
from the key backgrounds, and a few simple optimizations
have maintained practical computational requirements despite
the large number of library events used. Within the NOvA context,
the LEM technique has potential applications from reconstruction
of the hadronic system to the event energy measure
described above. More broadly, LEM can be applied to completely
different particle detectors or imaging systems in an array
of fields and industries, wherever one needs to classify
fine-grained images of objects whose visual characteristics vary in
known ways.
Appendix A. Additional technical notes

A few technical notes are included in this Appendix so as not 
to break up the discussion in the main text. 

Appendix A.1. Relation to other classification techniques

If $f_{\rm sig}$ and $f_{\rm enr}$ were calculated unweighted, then those
variables would be k-nearest-neighbors classifiers, albeit with very
large input vectors. With the weights $w_n$ applied, they act as
kernel density estimators. Note that

\[ \sqrt{E} = \sqrt{\tfrac{1}{2} \sum_{i,j} (a_i^\alpha - b_i^\alpha)\, T_{ij}\, (a_j^\alpha - b_j^\alpha)} \tag{A.1} \]

is a metric for the space of possible event images. That is,
distances defined in this way obey the triangle inequality. For a
Gaussian kernel in this space one would expect $w_n \sim \exp(-E)$,
which contrasts with the optimal value of $\gamma = 10$ found in
practice. Similarly, $\langle y \rangle$ is an estimator for the true value of $y$ using
the same kernel.

Methods exist to efficiently find nearest neighbors in general
metric spaces without having to rely on heuristics such as the
library index in Section 7.1. Testing of a vantage-point tree M
indicated its performance was affected by the curse of
dimensionality. A large fraction of the nodes would have to be entered
during a typical search.


Appendix A.2. Energy calculation when $r_{ij} \to 0$

The transfer matrix element $T_{ij}$ as written in Eq. (9)
diverges when $i = j$ since $\Delta p_{ii}$ and $\Delta c_{ii}$ are zero. Thus, for nearby
cell pairs ($\Delta p_{ij} < 5$ and $\Delta c_{ij} < 5$), the energy calculation is
performed as if the charge is distributed uniformly over each cell,
with


\[ T_{ij} = \int_0^1\!\int_0^1\!\int_0^1\!\int_0^1 r_{ij}(x, y, u, v)^{-\beta}\, dx\, dy\, du\, dv, \tag{A.2} \]
where $(x, y)$ and $(u, v)$ scan over the areas of cells $i$ and $j$ and
where $r_{ij}$ here is a generalization of the discrete distance used
in the main text:



For more distant pairs the simplified form of the transfer matrix
given in Eq. (9) is sufficient.

Appendix A.3. Neutrino oscillation weights

The retained matches are weighted according to Eq. (19),
which includes the probability for flavor oscillation. The
probabilities used are



\[ P(\nu_e \to \nu_e) = 0, \tag{A.7} \]


where L = 810 km is the oscillation baseline, E is the neutrino 
energy in GeV, and the oscillation parameters are taken to be 


\begin{align}
\theta_{13} &= 9.2^\circ \tag{A.8} \\
\theta_{23} &= 38.5^\circ \tag{A.9} \\
\Delta m^2 &= 2.35 \times 10^{-3}\ \mathrm{eV}^2. \tag{A.10}
\end{align}


These oscillation probabilities are first-order approximations to
the full expressions. This is both for practical reasons (the
second-order effects are poorly determined and are in fact what
NOvA aims to measure) and because there is no requirement
for the library to have any particular distribution of events in it.
The second-order effects can pull the probabilities higher or
lower, making this weighting a reasonable middle ground for
the library. The library is also made devoid of intrinsic νe from
the NuMI beam by setting that survival probability to zero.
The overall prefactor on the νμ → νe (signal) line relative to
the background lines actually does not enter in practice since
the signal, background, and π0-enriched background classes are
scaled to have equal total weight in the library.
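
As a rough illustration of a first-order oscillation weight of the kind described above, the sketch below evaluates the standard leading-order vacuum expression for νμ → νe appearance, sin²θ23 sin²2θ13 sin²(1.267 Δm² L/E), using the baseline and parameter values quoted in this appendix. This textbook form is an assumption made for illustration and may differ in detail from the expressions actually used to weight the library; the function names are hypothetical.

```python
import math

L_KM = 810.0                  # oscillation baseline in km
THETA13 = math.radians(9.2)   # Eq. (A.8)
THETA23 = math.radians(38.5)  # Eq. (A.9)
DM2_EV2 = 2.35e-3             # Eq. (A.10), in eV^2

def p_numu_to_nue(energy_gev):
    """Leading-order vacuum nu_mu -> nu_e appearance probability (assumed form)."""
    phase = 1.267 * DM2_EV2 * L_KM / energy_gev
    return (math.sin(THETA23) ** 2
            * math.sin(2.0 * THETA13) ** 2
            * math.sin(phase) ** 2)

def p_nue_to_nue(energy_gev):
    """Intrinsic beam nu_e are removed from the library, as in Eq. (A.7)."""
    return 0.0

# Appearance probability for a 2 GeV neutrino, roughly a few percent.
p_signal = p_numu_to_nue(2.0)
```

Since the signal and background classes are rescaled to equal total weight, only the energy dependence of such a weight matters within a class, not its overall normalization.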